Bayesian ecology in R

Pavel Jakubec
"2016-09-14"

Content

  • Scientific reasoning
    • How we reason
    • Motivated reasoning
  • Bayes' theorem
    • Explanation
    • Terminology
    • Pros & Cons
  • Prior knowledge
    • Example
  • Frequentists vs Bayesians
    • Comparison of binomial tests
  • Normal Linear model
    • Simulation
    • Inspection
    • Conclusions

Reasoning

Psittacula krameri

  • Rose-ringed parakeet
  • Motivated reasoning & cognitive bias

Need for objective and transparent conclusions.

Bayes' theorem

\[ P(A \mid B)= \frac {P(A)\,P(B \mid A)}{P(B)} \]

\[ P(\theta \mid y)= \frac {P(\theta)\,P(y \mid \theta)}{P(y)} \]

\( \theta \) = Model parameter

\( y \) = Observed data

\( P(y)=\int P(\theta)P(y\mid\theta)d\theta \)
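Because the marginal likelihood \( P(y) \) is often an intractable integral, a quick way to see the theorem in action is a grid approximation. The sketch below is illustrative only (not from the slides); it assumes a binomial likelihood with a flat Beta(1,1) prior:

```r
# Grid approximation of the posterior for a binomial proportion theta
theta <- seq(0, 1, length.out = 1000)        # grid over the parameter space
prior <- dbeta(theta, 1, 1)                  # flat prior P(theta)
lik   <- dbinom(2, size = 10, prob = theta)  # likelihood P(y | theta): 2 successes in 10 trials
post  <- prior * lik                         # unnormalised posterior
post  <- post / sum(post)                    # dividing by the sum plays the role of P(y)
sum(theta * post)                            # posterior mean; analytic Beta(3, 9) gives 0.25
```

Normalising by `sum(post)` is the discrete stand-in for the integral: it guarantees the posterior sums to one without ever computing \( P(y) \) analytically.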

Bayesians

Data = Truth, Parameters ~ Probability distribution

Thomas Bayes (1702 - 1761)

Basic vocabulary

  • Prior - prior probability distribution (our beliefs before seeing the data)
  • Posterior - posterior probability distribution (what we should believe after seeing the evidence, i.e. the data)
  • Marginal posterior - probability distribution of a single model parameter
  • Likelihood - \( P(y\mid \theta ) \) - measure of the information we can gain from the data
  • MCMC - Markov chain Monte Carlo (the best-known, but not the only, method for sampling from the posterior)

Confidence interval vs. Credible interval

  • Credible interval: we are 95% sure that the true parameter value lies within this interval.
  • Confidence interval: the range of likely values of the parameter (point estimate ± margin of error); 95% of intervals constructed this way would contain the true value.
triplot.normal.knownvariance2(theta.data=4, variance.known=4, n=5, prior.theta=8, prior.variance=4)

plot of chunk unnamed-chunk-2

Basic vocabulary

  • Prior - prior probability distribution (our beliefs before seeing the data)
  • Posterior - posterior probability distribution (what we should believe after seeing the evidence, i.e. the data)
  • Marginal posterior - probability distribution of a single model parameter
  • Likelihood - \( P(y\mid \theta ) \) - measure of the information we can gain from the data
  • MCMC - Markov chain Monte Carlo (the best-known, but not the only, method for sampling from the posterior)

Confidence interval vs. Credible interval

  • Credible interval: we are 95% sure that the true parameter value lies within this interval.
  • Confidence interval: the range of likely values of the parameter (point estimate ± margin of error); 95% of intervals constructed this way would contain the true value.
triplot.normal.knownvariance2(theta.data=4, variance.known=4, n=5, prior.theta=8, prior.variance=1)

plot of chunk unnamed-chunk-3

Prior knowledge

Cancer case


Positive cancer test \( \neq \) cancer

# Variables
TP <- 0.9 #True Positive: 90%
FP <- 0.1 #False Positive: 10%

\[ P(cancer \mid positive\ test)= \frac {P(cancer)\,P(positive\ test \mid cancer)}{P(cancer)\,P(positive\ test \mid cancer)+P(no\ cancer)\,P(positive\ test \mid no\ cancer)} \]

prior <- 0.01 #Prior (prevalence of cancer in population): 1%

# Bayesian interpretation of the test
result <- (TP*prior)/(TP*prior+FP*(1-prior))

result*100
[1] 8.333333
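The same arithmetic generalises to repeated testing: the posterior from one test becomes the prior for the next. A small sketch (the helper name `bayes_update` is hypothetical, not from the slides):

```r
# Hypothetical helper wrapping the Bayes-rule calculation above
bayes_update <- function(prior, TP = 0.9, FP = 0.1) {
  (TP * prior) / (TP * prior + FP * (1 - prior))
}
p1 <- bayes_update(0.01) # first positive test: 0.0833, as above
p2 <- bayes_update(p1)   # second positive test, using p1 as the new prior: 0.45
```

Two independent positive tests raise the probability of cancer from 1% to 45% - the prior does a lot of work when the disease is rare.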

Pros and cons

  • Bayesians are (more or less) ok with:

    Using prior knowledge
    Low sample size (rare species, lots of NAs, expensive sampling)
    Multiple comparisons (Gelman et al., 2012)
    Speeding up science & research (significant results can later be updated or disproved)

  • Upside

    More meaningful and comprehensive inferences (the only exact way to draw inferences for generalized linear mixed models; Bolker et al., 2009)

  • Downside

    Priors can disproportionately influence the posterior
    Choosing the right/appropriate prior can be challenging
    Computationally heavy (less of a problem now)
    Garbage in = Garbage out

Exaggerated example of the two approaches

STORY: We screened Amur leopards (Panthera pardus orientalis) for the presence of a dangerous blood parasite. If the true percentage of infected animals is greater than 10%, we have to inform the authorities and take measures to treat them.

CHALLENGE: The Amur leopard is a very rare and endangered animal, therefore you should use the smallest sample possible.

data <- data          #Hidden data (revealed later)
N <- c(5,10,15,20,40) #Sample sizes
sub <- list()         #List for storing data subsets of size N
tests.freq <- list()  #List for storing results of the exact binomial test
tests.bayes <- list() #List for storing results of the Bayesian version of the exact binomial test
for (i in 1:5) {
  sub[[i]] <- data[1:N[i]]
  tests.freq[[i]] <- stats::binom.test(c(length(sub[[i]][sub[[i]]=="1"]), length(sub[[i]][sub[[i]]=="0"])), p=0.1, alternative="greater")
  tests.bayes[[i]] <- BayesianFirstAid::bayes.binom.test(c(length(sub[[i]][sub[[i]]=="1"]), length(sub[[i]][sub[[i]]=="0"])), p=0.1, n.iter=150000)
}

Panthera pardus orientalis

Exaggerated example of the two approaches

tests.freq[[1]]

    Exact binomial test

data:  c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 1, number of trials = 5, p-value = 0.4095
alternative hypothesis: true probability of success is greater than 0.1
95 percent confidence interval:
 0.01020622 1.00000000
sample estimates:
probability of success 
                   0.2 
tests.freq[[2]]

    Exact binomial test

data:  c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 2, number of trials = 10, p-value = 0.2639
alternative hypothesis: true probability of success is greater than 0.1
95 percent confidence interval:
 0.03677144 1.00000000
sample estimates:
probability of success 
                   0.2 

Exaggerated example of the two approaches

tests.freq[[3]]

    Exact binomial test

data:  c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 3, number of trials = 15, p-value = 0.1841
alternative hypothesis: true probability of success is greater than 0.1
95 percent confidence interval:
 0.05684687 1.00000000
sample estimates:
probability of success 
                   0.2 
tests.freq[[4]]

    Exact binomial test

data:  c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 4, number of trials = 20, p-value = 0.133
alternative hypothesis: true probability of success is greater than 0.1
95 percent confidence interval:
 0.07135388 1.00000000
sample estimates:
probability of success 
                   0.2 

Exaggerated example of the two approaches

  • The true proportion of infected animals is 20%.
data
 [1] 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
[36] 0 0 0 0 1
  • We would probably need ALL the Amur leopards in the world to barely reject the frequentist H0 and do something to save them.

This is linked to statistical power of the test:

h<-pwr::ES.h(0.2,0.1) #true proportion is 20% and H0 proportion is 10%
pwr::pwr.p.test(h=h, power=0.8, sig.level=0.05, alternative = "greater")

     proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.2837941
              n = 76.76467
      sig.level = 0.05
          power = 0.8
    alternative = greater
  • If the frequentist H0 were different, the outcome would also be different; this is not the case for the Bayesian variant.

Exaggerated example of the two approaches

tests.bayes[[1]]

    Bayesian First Aid binomial test

data: c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 1, number of trials = 5
Estimated relative frequency of success:
  0.26 
95% credible interval:
  0.020 0.59 
The relative frequency of success is more than 0.1 by a probability of 0.886 
and less than 0.1 by a probability of 0.114 
tests.bayes[[2]]

    Bayesian First Aid binomial test

data: c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 2, number of trials = 10
Estimated relative frequency of success:
  0.24 
95% credible interval:
  0.039 0.48 
The relative frequency of success is more than 0.1 by a probability of 0.909 
and less than 0.1 by a probability of 0.091 

Exaggerated example of the two approaches

tests.bayes[[3]]

    Bayesian First Aid binomial test

data: c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 3, number of trials = 15
Estimated relative frequency of success:
  0.22 
95% credible interval:
  0.055 0.43 
The relative frequency of success is more than 0.1 by a probability of 0.931 
and less than 0.1 by a probability of 0.069 
tests.bayes[[4]]

    Bayesian First Aid binomial test

data: c(length(sub[[i]][sub[[i]] == "1"]), length(sub[[i]][sub[[i]] ==     "0"]))
number of successes = 4, number of trials = 20
Estimated relative frequency of success:
  0.22 
95% credible interval:
  0.071 0.40 
The relative frequency of success is more than 0.1 by a probability of 0.948 
and less than 0.1 by a probability of 0.052 
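`bayes.binom.test` places (to my knowledge) a flat Beta(1,1) prior on the proportion, so the MCMC output above can be cross-checked against the closed-form Beta posterior; for 4 successes in 20 trials the posterior is Beta(5, 17):

```r
# Closed-form check of the last result above (assuming a Beta(1,1) prior)
s <- 4; n <- 20
1 - pbeta(0.1, 1 + s, 1 + n - s)          # P(theta > 0.1); approx. 0.948, matching the MCMC run
qbeta(c(0.025, 0.975), 1 + s, 1 + n - s)  # equal-tailed 95% interval (close to, but not identical with, the reported HDI)
```

The agreement with the sampled value (0.948) is a useful sanity check that 150,000 MCMC iterations have converged for this simple model.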

Normal "Linear" Model - simulation

Michaelis-Menten curve: \[ f(x) = \frac{ax}{b+x} \]

##Data simulation
#Michaelis-Menten mean function (see formula above; defined here, hidden in the original slides)
micmen <- function(x, a, b) a*x/(b+x)
#Parameters
set.seed(1337) # fixed seed for reproducible random generation
sigma <- 5 # standard deviation of the residuals
a <- 30 # asymptote 
b <- 25 # half-maximum
#Simulation part
x <- rep(seq(0,30,5),7) # sample values of the covariate
y <- rnorm(length(x), mean=micmen(x, a=a, b=b), sd=sigma) # response with normal residuals

plot of chunk unnamed-chunk-14

Normal "Linear" Model - simulation

Developmental rate ~ Temperature

rm(list=ls())
##Data simulation
#Parameters
set.seed(1337) # fixed seed for reproducible random generation
sigma <- 5 # standard deviation of the residuals
a <- 30 # intercept 
b <- -1 # slope
#Simulation part
x <- rep(seq(0,30,5),7) # Temperature (independent variable)
yhat <- a+b*x           # expected developmental rate
y <- rnorm(length(x), mean=yhat, sd=sigma) # Rate (dependent variable)

plot of chunk unnamed-chunk-16

Normal "Linear" Model - Model + Model inspection

# Model
mod <- lm(y~x)

plot of chunk unnamed-chunk-18

Normal "Linear" Model - Model inspection

plot of chunk unnamed-chunk-19

Normal "Linear" Model - Bayesian conclusions

When to use arm::sim function:

  • Simple models (lm, glm, lmer, glmer)
  • No prior knowledge
  • Flat (uninformative) prior distributions are acceptable
nsim <- 5000
bsim <- arm::sim(mod, n.sim=nsim)

CrI for coefficients and estimated residual standard deviation \( \hat{\sigma} \)

apply(coef(bsim), 2, quantile, prob=c(0.025, 0.975))
      (Intercept)          x
2.5%     27.47555 -1.0821241
97.5%    33.33091 -0.7512295
quantile(bsim@sigma, prob=c(0.025, 0.975))
    2.5%    97.5% 
4.801013 7.212846 
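The simulated coefficient draws also answer direct probability questions, e.g. "how likely is the slope to be steeper than -0.8?". A self-contained sketch that re-creates the model from the simulation slide (requires the arm package; the -0.8 threshold is an arbitrary choice for illustration):

```r
library(arm)                      # provides sim()
set.seed(1337)
x <- rep(seq(0, 30, 5), 7)        # same simulated data as above
y <- rnorm(length(x), mean = 30 - 1 * x, sd = 5)
mod  <- lm(y ~ x)
bsim <- sim(mod, n.sim = 5000)
mean(coef(bsim)[, "x"] < -0.8)    # share of posterior draws with slope < -0.8
```

Such statements ("the slope is steeper than -0.8 with probability p") are exactly what frequentist p-values are often mistaken for, and they fall out of the simulation for free.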

plot of chunk unnamed-chunk-23

Normal "Linear" Model - Frequentist conclusions

summary(lm(y~x))

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max 
-11.7627  -3.7142  -0.0512   3.5295  14.7567 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  30.3764     1.4856   20.45  < 2e-16 ***
x            -0.9161     0.0824  -11.12 9.41e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.768 on 47 degrees of freedom
Multiple R-squared:  0.7245,    Adjusted R-squared:  0.7186 
F-statistic: 123.6 on 1 and 47 DF,  p-value: 9.409e-15

TRUE parameters:
Intercept = 30
Slope = -1
\( \sigma \) = 5

Take home message

References and Acknowledgment

Bolker, B. M. (2008). Ecological models and data in R. Princeton University Press.

Bolker, B. M., Brooks, M. E., Clark, C. J., Geange, S. W., Poulsen, J. R., Stevens, M. H. H., & White, J. S. S. (2009). Generalized linear mixed models: a practical guide for ecology and evolution. Trends in ecology & evolution, 24(3), 127-135.

Gelman, A., Hill, J., & Yajima, M. (2012). Why we (usually) don't have to worry about multiple comparisons. Journal of Research on Educational Effectiveness, 5(2), 189-211.

Korner-Nievergelt, F., Roth, T., von Felten, S., Guélat, J., Almasi, B., & Korner-Nievergelt, P. (2015). Bayesian data analysis in ecology using linear models with R, BUGS, and Stan. Academic Press.

References and Acknowledgment

package version date source
abind 1.4-5 2016-07-21 CRAN (R 3.3.1)
arm * 1.9-1 2016-08-24 CRAN (R 3.3.1)
BayesianFirstAid * 0.1 2016-08-31 Github
circlize * 0.3.8 2016-08-14 CRAN (R 3.3.1)
cluster 2.0.4 2016-04-18 CRAN (R 3.3.1)
coda * 0.18-1 2015-10-16 CRAN (R 3.3.1)
colorspace 1.2-6 2015-03-11 CRAN (R 3.3.1)
devtools * 1.12.0 2016-06-24 CRAN (R 3.3.1)
digest 0.6.10 2016-08-02 CRAN (R 3.3.1)
ggplot2 * 2.1.0 2016-03-01 CRAN (R 3.3.1)
GlobalOptions 0.0.10 2016-04-17 CRAN (R 3.3.1)
gtable 0.2.0 2016-02-26 CRAN (R 3.3.1)
knitr * 1.14 2016-08-13 CRAN (R 3.3.1)
lattice 0.20-33 2015-07-14 CRAN (R 3.3.1)
lme4 * 1.1-12 2016-04-16 CRAN (R 3.3.1)
magrittr 1.5 2014-11-22 CRAN (R 3.3.1)
MASS * 7.3-45 2016-04-21 CRAN (R 3.3.1)
Matrix * 1.2-6 2016-05-02 CRAN (R 3.3.1)
memoise 1.0.0 2016-01-29 CRAN (R 3.3.1)
minqa 1.2.4 2014-10-09 CRAN (R 3.3.1)
munsell 0.4.3 2016-02-13 CRAN (R 3.3.1)
nlme 3.1-128 2016-05-10 CRAN (R 3.3.1)
nloptr 1.0.4 2014-08-04 CRAN (R 3.3.1)
plyr 1.8.4 2016-06-08 CRAN (R 3.3.1)
pwr * 1.2-0 2016-08-24 CRAN (R 3.3.1)
Rcpp 0.12.6 2016-07-19 CRAN (R 3.3.1)
rjags * 4-6 2016-02-19 CRAN (R 3.3.1)
scales 0.4.0 2016-02-26 CRAN (R 3.3.1)
shape 1.4.2 2014-11-05 CRAN (R 3.3.0)
stringi 1.1.1 2016-05-27 CRAN (R 3.3.0)
stringr 1.1.0 2016-08-19 CRAN (R 3.3.1)
withr 1.0.2 2016-06-20 CRAN (R 3.3.1)